Amelia McNamara
August 18, 2016
There are three main types of geographic data:
People often also talk about images as geographic data, because so many maps come that way (tiles from Google Maps and OpenStreetMap, satellite imagery). So, we could also think about the difference between
Mapbox has a nice explanation of how map tiles work. Let’s check it out!
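To make the tile idea concrete, here is a minimal sketch of the "slippy map" tile numbering used by OpenStreetMap-style tile servers: at zoom level z the world is cut into 2^z by 2^z square tiles, and every longitude/latitude pair falls into exactly one of them. The function name `lonlat_to_tile` is made up for this illustration.

```r
# Hypothetical helper: which tile contains a given point at a given zoom?
lonlat_to_tile <- function(lon, lat, zoom) {
  n <- 2^zoom                      # number of tiles along each axis
  lat_rad <- lat * pi / 180        # latitude in radians
  x <- floor((lon + 180) / 360 * n)
  y <- floor((1 - log(tan(lat_rad) + 1 / cos(lat_rad)) / pi) / 2 * n)
  c(x = x, y = y)
}

# Minneapolis at zoom 4 (a 16 x 16 grid of tiles)
lonlat_to_tile(-93.2650, 44.9778, zoom = 4)
```

Each extra zoom level doubles the number of tiles along each axis, which is why a zoomed-in map is stitched together from many small images.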
Leaflet is a JavaScript mapping library.
This means it plays nicer with things like d3, but we can use it in R as well, through an htmlwidget.
We’re going to be doing a lot of the stuff from the RStudio leaflet tutorial.
library(leaflet)

m <- leaflet() %>%
  setView(-93.2650, 44.9778, zoom = 4) %>%
  addProviderTiles("Stamen.Toner") %>% # Add Stamen Toner tiles instead of the default OpenStreetMap tiles
  addMarkers(lng = -93.2650, lat = 44.9778, popup = "Minneapolis!")
m

Let's look at storm data from NOAA. It comes in a few files that we need to join together in order to use.
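As a toy illustration of the kind of join involved, here is a minimal sketch with two tiny made-up tables. The column names mirror the NOAA files, but the values are invented for this example. A left join keeps every location row and attaches the matching event details, filling in NA where no details exist.

```r
library(dplyr)

# Made-up location records: event 1 has two locations, event 3 has no details
locs <- data.frame(
  EVENT_ID  = c(1, 1, 2, 3),
  LATITUDE  = c(44.9, 45.0, 41.2, 39.7),
  LONGITUDE = c(-93.2, -93.1, -96.0, -105.0)
)

# Made-up event details, one row per event
details <- data.frame(
  EVENT_ID   = c(1, 2),
  EVENT_TYPE = c("Tornado", "Lightning")
)

# Every row of locs survives; EVENT_TYPE is NA for the unmatched event
joined <- left_join(locs, details, by = "EVENT_ID")
joined
```

Because one event can have several locations, the joined table can have more rows than the details table.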
library(readr)
library(dplyr)
library(RCurl)
stormlocs <- getURL("https://raw.githubusercontent.com/AmeliaMN/SummerDataViz/master/Geo/StormEvents_locations-ftp_v1.0_d2016_c20160810.csv")
stormlocs <- read_csv(stormlocs)
stormdetails <- getURL("https://raw.githubusercontent.com/AmeliaMN/SummerDataViz/master/Geo/StormEvents_details-ftp_v1.0_d2016_c20160810.csv")
stormdetails <- read_csv(stormdetails)
# Alternatively, read local copies of the files:
# stormlocs <- read_csv("StormEvents_locations-ftp_v1.0_d2016_c20160810.csv")
# stormdetails <- read_csv("StormEvents_details-ftp_v1.0_d2016_c20160810.csv")

# Attach the event details to each location record
stormlocs <- stormlocs %>%
  left_join(stormdetails, by = "EVENT_ID")

lightning <- stormlocs %>%
  filter(EVENT_TYPE == "Lightning")

Now, we can programmatically map them.
m <- leaflet(data = lightning) %>%
  addTiles() %>% # Add default OpenStreetMap map tiles
  addMarkers(~LONGITUDE, ~LATITUDE)
m

Bonus: add popups!
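The tornado map that follows sizes its circles by DAMAGE_PROPERTY. One thing worth checking first (with `str(tornados)`) is whether that column was read as numbers or as text. This sketch assumes it is text such as "10.00K" or "1.50M", which would need converting before it can act as a radius. The helper name `parse_damage` is hypothetical.

```r
# Hypothetical helper: turn strings like "10.00K" or "1.50M" into dollars.
# Assumption: the only suffixes are K (thousand), M (million), B (billion).
parse_damage <- function(x) {
  x <- trimws(x)
  mult <- c(K = 1e3, M = 1e6, B = 1e9)
  suffix <- substr(x, nchar(x), nchar(x))          # last character
  has_suffix <- suffix %in% names(mult)
  digits <- ifelse(has_suffix, substr(x, 1, nchar(x) - 1), x)
  number <- suppressWarnings(as.numeric(digits))   # unparseable text becomes NA
  unname(number * ifelse(has_suffix, mult[suffix], 1))
}

parse_damage(c("10.00K", "1.50M", "0.00K", "250"))
```

Keep in mind that `addCircles()` interprets `radius` in meters, so raw dollar amounts may need rescaling (for example with `sqrt()`) before the circles fit on the map.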
tornados <- stormlocs %>%
  filter(EVENT_TYPE == "Tornado")

m <- leaflet(data = tornados) %>%
  addTiles() %>% # Add default OpenStreetMap map tiles
  addCircles(~LONGITUDE, ~LATITUDE, weight = 1, radius = ~DAMAGE_PROPERTY, popup = ~EVENT_NARRATIVE)
m

Most of the time, you need to aggregate your data in some way. Think about states, countries, zip codes, counties, etc. Data often comes pre-aggregated at a particular spatial level, or there are too many individual points to show them all.
Most boundaries (state, national, etc.) are provided as polygons. ArcGIS, the major mapping software from ESRI, has essentially set the standard file formats.
Look in the GitHub repo to see some examples of how this looks. There are many files with different extensions: .prj (the projection), .shp (the feature geometry), .cpg (the character encoding of the attribute table), .dbf (the attribute table), .shx (an index into the geometry).
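Since a "shapefile" is really a bundle of files, it can be handy to check that the pieces are all there before trying to read it. Here is a minimal sketch using only base R. The helper names `shapefile_parts` and `is_complete` are made up for this illustration, and the sketch assumes all the pieces share one base name and sit in the same folder.

```r
# Hypothetical helper: report which shapefile components exist for a layer.
# Assumption: all pieces share a base name and live in the same folder.
shapefile_parts <- function(dir, layer) {
  exts <- c("shp", "shx", "dbf", "prj", "cpg")
  present <- file.exists(file.path(dir, paste0(layer, ".", exts)))
  names(present) <- exts
  present
}

# The .shp, .shx, and .dbf files are mandatory; the others are optional
is_complete <- function(parts) all(parts[c("shp", "shx", "dbf")])

parts <- shapefile_parts("cb_2015_us_state_500k", "cb_2015_us_state_500k")
parts
is_complete(parts)
```

If `is_complete()` comes back FALSE, the usual culprit is that only the .shp file was copied or downloaded without its companions.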
You need special software or packages to work with shapefiles.
I got these from the Census. You can choose the resolution.
library(rgdal)

states <- readOGR("cb_2015_us_state_500k", layer = "cb_2015_us_state_500k", verbose = FALSE)

# Count tornado records per state
tornadocount <- tornados %>%
  group_by(STATE) %>%
  summarize(n = n())

colors <- c("#edf8fb", "#b2e2e2", "#66c2a4", "#238b45")
# include.lowest keeps the state with the smallest count from falling out of the bins
tornadocount <- tornadocount %>%
  mutate(color = cut(n, breaks = quantile(n), include.lowest = TRUE))
# Baaaaad factor practice. Do as I say, not as I do?!
levels(tornadocount$color) <- colors

# NOAA spells STATE in all caps, while the Census NAME column is mixed case,
# so join on an upper-cased copy of NAME
states@data <- states@data %>%
  mutate(STATE = toupper(NAME)) %>%
  left_join(tornadocount, by = "STATE")

# Not currently working
leaflet(data = states) %>%
  addTiles() %>%
  addPolygons(stroke = FALSE, fillOpacity = 0.5, smoothFactor = 0.5, color = ~color)

Another way to do mapmaking in R is with the ggmap package. Let's try that together!
library(ggmap)

I have a bunch of spatial stuff in this GitHub repo, so we can check some of that out.
My colleague Aran Lunzer and I made a demo of how spatial aggregation can impact the visual pattern you see. Let’s go check it out!
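For a small taste of the same idea in code, here is a minimal sketch that counts a handful of made-up points in square grid cells of two different sizes. The points never move, but the pattern of counts changes with the cell size. The coordinates and the helper name `count_by_cell` are invented for this illustration and are not taken from the demo.

```r
# Hypothetical helper: count points per square grid cell of a given size (in degrees)
count_by_cell <- function(lon, lat, size) {
  cell <- paste(floor(lon / size), floor(lat / size), sep = ",")
  table(cell)
}

# Six made-up points
lon <- c(-93.3, -93.1, -92.6, -92.4, -91.8, -90.2)
lat <- c(44.9, 44.8, 44.6, 45.2, 45.4, 44.1)

fine   <- count_by_cell(lon, lat, size = 1)  # many cells, small counts
coarse <- count_by_cell(lon, lat, size = 2)  # few cells, larger counts

fine
coarse
```

With 1-degree cells the points look fairly spread out; with 2-degree cells most of them pile into a single cell, even though nothing about the underlying data changed.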